1. Introduction

The problem of estimating the tail probability $1-F(x) = P(X>x)$ of a r.v. $X$, for large $x$, has obvious practical importance, for example where large values of $X$ have serious or catastrophic implications for health or safety. In such cases one typically has limited data, so that nonparametric procedures often cannot be successfully applied, but one may also be unwilling to fit a fully parametrized distribution over the entire data range.

A popular compromise with wide applicability is to assume that the tail $1-F(x)$ decays approximately in an exponential manner, like $e^{-x/\beta}$ as $x \to \infty$, or (by log transformations) as an approximate inverse power law in the sense of regular variation, viz.

(1.1)  $\dfrac{1-F(tx)}{1-F(t)} \to x^{-\alpha}$ as $t \to \infty$, $x > 0$,

for some index $\alpha > 0$.
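The phrase "by log transformations" can be made explicit with a short calculation. The following is only a sketch: the symbol $G$ for the d.f. of $\log X$ is introduced here for illustration and does not appear in the text, and $X > 0$ is assumed so that the logarithm is defined.

```latex
% Assumption: X > 0 with 1 - F regularly varying of index -\alpha, i.e.
%   (1 - F(tx)) / (1 - F(t)) \to x^{-\alpha}  as  t \to \infty,  x > 0.
% Let G (hypothetical notation) be the d.f. of \log X, so 1 - G(s) = 1 - F(e^{s}).
\[
  \frac{1-G(s+y)}{1-G(s)}
  \;=\; \frac{1-F(e^{s}e^{y})}{1-F(e^{s})}
  \;\longrightarrow\; (e^{y})^{-\alpha}
  \;=\; e^{-\alpha y},
  \qquad s \to \infty,\; y \ge 0 .
\]
% Hence \log X has an exponential-type tail with \beta = 1/\alpha.
```

So an estimate of $\beta$ computed from $\log X_1, \ldots, \log X_n$ can be read as an estimate of $1/\alpha$.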

Estimation procedures for the exponential or regular variation parameters, based on "high" values in an i.i.d. sample $X_1, \ldots, X_n$, have been studied by a number of authors. In particular the so-called "Hill estimator" (cf. [10] [7] [2] [6] [11] [1]) is based on the upper $c_n$ ($= o(n)$) order statistics $X_1^{(n)} \ge \cdots \ge X_{c_n}^{(n)}$, having the form, in the exponential case,

$\hat\beta_n = \dfrac{1}{c_n} \sum_{i=1}^{c_n} \bigl(X_i^{(n)} - X_{c_n}^{(n)}\bigr)$,

and in the regularly varying case is changed by using $\log X$ instead of $X$. One of the main purposes of the present work is to obtain the properties of this and related estimators (in particular, asymptotic normality) if the observations are no longer independent, but appropriate long range dependence restrictions are assumed, and to extend these results to estimation of tail probabilities and quantiles. We are grateful to T. Hsing for sending us his concurrent work [9], which concerns some of the topics considered here (i.e. the Hill estimate) under similar mixing conditions but using more detailed and precise local dependence assumptions rather than univariate tail conditions as here. Our principal results are given in Sections 4 and 5, following preliminary general central limit results in Sections 2 and 3.

Section 6 proceeds to the original question of estimating tail probabilities $1-F(x)$ for large $x$, and tail quantiles, i.e. the $(1-p)$th quantile of $F$ for small $p$ values. Asymptotic distributional results are obtained for natural estimates based on the tail parameter estimates of Sections 4 and 5.

In the foregoing results conditions on the dependence structure are obtained so that properties of these estimates for the tail of the marginal d.f. still hold as in the i.i.d. situation. It can also be of interest to estimate tail properties involving not one but groups of the r.v.'s $X_j$, when it must be expected that the form of the results will also change with the introduction of dependence. Such a case is also discussed in Section 6, where tail properties of the maximum $M_N = \max(X_1, \ldots, X_N)$ of $N$ consecutive values are considered. In cases when "local dependence" between the $X_j$ is not too high, the tail properties of the maximum are the same as in the i.i.d. case. However, high local dependence introduces clustering of high values, which changes the tail properties of the maximum in a very simple way depending on a single parameter $\theta$ known as the "extremal index" of the sequence. In Section 6 the tail parameter estimates are combined with analogously constructed estimates of $\theta$ to give estimates of tail probabilities and tail quantiles for the maximum.

The discussion in Sections 3-6 involves exponential tail decay. In Section 7 the modifications needed for regularly varying tails are briefly indicated.

In  Section  8  the  methods  are  applied  to  data  consisting  of  tide  heights  at 
a station on the Dutch coast. Estimates are obtained for both tail parameter
and  extremal  index  with  various  choices  for  the  number  of  upper  order 
statistics  used.  The  effect  of  these  choices  provides  some  insight  into  the 
properties  of  the  procedure.  In  Section  9  simulations  are  carried  out  for  two 
processes  with  exponential  tails  (i.i.d.  and  moving  average  sequences)  and  two 
with  Pareto  tails  (moving  average  and  autoregressive  sequences).  The 
simulations  show  that  convergence  is  somewhat  slow,  and  that  it  might  be  of 
interest  to  investigate  "higher  order  approximations".  Nevertheless,  the 
present  methods  certainly  seem  sufficient  for  many  engineering  problems  where 
ample  data  is  available,  such  as  the  water  levels  from  Section  8. 

Finally in this introduction we note the precise form of the strong mixing assumption to be used throughout for the (strictly) stationary sequence $X_1, X_2, \ldots$. Write $\mathcal{F}_i^j$ for the $\sigma$-field $\sigma\{X_k: i \le k \le j\}$ generated by $X_i, X_{i+1}, \ldots, X_j$, and for fixed $n$, $\ell < n$,

$\alpha_{n,\ell} = \sup\{|P(A \cap B) - P(A)P(B)|: A \in \mathcal{F}_1^k,\ B \in \mathcal{F}_{k+\ell}^n,\ 1 \le k \le n-\ell\}$.

Then $\{X_j\}$ will be termed "strongly mixing ($\alpha_{n,\ell_n}$)" if $\alpha_{n,\ell_n} \to 0$ for some $\ell_n = o(n)$. It may be shown that the existence of such a sequence $\{\ell_n\}$ follows if $\alpha_{n,[\epsilon n]} \to 0$ as $n \to \infty$ for each $\epsilon > 0$. This "array form" of strong mixing is of course implied by the standard definition (in which $k+\ell$ is not restricted to values no larger than $n$). For particular purposes weaker forms of the condition may be used, replacing $\mathcal{F}_i^j$ by the $\sigma$-field generated by the functions of $X_i, X_{i+1}, \ldots, X_j$ relevant to the problem (such as $1_{\{X_i > u_n\}}$ or $(X_i - u_n)_+$ for given $u_n$). However, in the present context this is unlikely to achieve a significant reduction of conditions and we simply assume the above full strong mixing condition.
2. Notation, assumptions, and general results.

It will be assumed throughout that

(i) $\{X_j\}$ is stationary (with marginal d.f. $F$), strongly mixing ($\alpha_{n,\ell_n}$).

(ii) Integers $k_n \to \infty$ are chosen such that $k_n(\alpha_{n,\ell_n} + \ell_n/n) \to 0$. Write $r_n = [n/k_n]$. Hence, in particular, $\ell_n = o(r_n)$.

(iii) Integers $c_n \to \infty$ and "levels" $u_n$ are chosen with $(1-F(u_n)) \sim c_n/n$.

(iv) $\psi$ is a left-continuous function on the positive real line $R_+ = [0,\infty)$, of bounded variation on finite ranges, and such that $\psi(0) = 0$, $E\psi^2(X_1) < \infty$.

The conditions (i) - (iv) will be referred to as the Basic Assumptions. Other assumptions will be made as needed and stated. For example the condition

(2.1)  $c_n = o(k_n)$

will also occasionally be used (when stated). While the main results will be proved without this assumption, its use leads to simplification of sufficient conditions.

Write

$\beta_n^* = \dfrac{1}{c_n} \sum_{i=1}^{n} \psi((X_i - u_n)_+)$,

$\beta_n = E\beta_n^* = \dfrac{n}{c_n}\, E\psi((X_1 - u_n)_+)$.

A primary aim of this section is to show that

(2.2)  $(c_n/\lambda_n)^{1/2}\,(\beta_n^* - \beta_n) \xrightarrow{d} N(0,1)$

where

(2.3)  $\lambda_n = \dfrac{k_n}{c_n}\, \operatorname{var}\Bigl\{ \sum_{j=1}^{r_n} \psi((X_j - u_n)_+) \Bigr\}$

under appropriate conditions on $F$ and $\psi$. The dependence of the $\beta_n^*$ on unknown underlying parameters restricts their practical usefulness as estimators. However it will be seen that the result (2.2) is basic in providing asymptotic distributional results for natural estimators formed by modifying $\beta_n^*$.
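As a concrete illustration of the statistic $\beta_n^* = c_n^{-1}\sum_{i=1}^n \psi((X_i-u_n)_+)$, the following minimal Python sketch evaluates it for a simulated i.i.d. exponential sample. The function name `beta_star`, the sample size, the choice $\psi(x)=x$ and the use of the exact level $u_n = \beta\log(n/c_n)$ are all assumptions made for the example, not part of the text; in practice $u_n$ would depend on the unknown $F$, which is the limitation noted above.

```python
import math
import random


def beta_star(xs, u, c, psi):
    """Return (1/c) * sum_i psi((x_i - u)_+) for a sample xs and level u."""
    return sum(psi(max(x - u, 0.0)) for x in xs) / c


# Illustrative setup (all values are assumptions for this sketch).
random.seed(0)
beta = 2.0                      # true scale of the exponential tail
n, c = 20000, 400               # sample size and number of "high" values
xs = [random.expovariate(1.0 / beta) for _ in range(n)]

# For an exact exponential d.f., 1 - F(u) = c/n when u = beta * log(n/c).
u = beta * math.log(n / c)

# With psi(x) = x the centering constant is (n/c) * E(X - u)_+ = beta.
est = beta_star(xs, u, c, lambda y: y)
```

With these choices `est` should lie close to `beta`, with fluctuations of order $(\lambda_n/c_n)^{1/2}$.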

The proof of (2.2) will be carried out by splitting the sum for $\beta_n^*$ into groups which may be assumed independent, and applying the Lindeberg Central Limit Theorem. Write $m_n = n - k_n r_n$ and define "intervals" $J_1, J_2, \ldots, J_{k_n}$ to each consist of $r_n$ consecutive integers, the first $m_n+1$ being separated by one integer, and the remainder abutting, i.e.,

$J_i = ((i-1)r_n + i,\ (i-1)r_n + i + 1,\ \ldots,\ i r_n + i - 1)$,  $1 \le i \le m_n$,
$J_i = ((i-1)r_n + m_n + 1,\ (i-1)r_n + m_n + 2,\ \ldots,\ i r_n + m_n)$,  $m_n < i \le k_n$.

Let $J'_i$ denote the first $(r_n - \ell_n)$ integers in $J_i$, $1 \le i \le k_n$, and write

$Y_j\ (= Y_{n,j}) = (X_j - u_n)_+$,

$Z_i\ (= Z_{n,i}) = (\lambda_n c_n)^{-1/2} \sum_{j \in J_i} \psi(Y_j)$,  $1 \le i \le k_n$,

$U_i\ (= U_{n,i}) = (\lambda_n c_n)^{-1/2} \sum_{j \in J'_i} \psi(Y_j)$,

$V_i\ (= V_{n,i}) = Z_i - U_i$,

$W_i\ (= W_{n,i}) = (\lambda_n c_n)^{-1/2}\, \psi(Y_{i(r_n+1)})$,  $1 \le i \le m_n$.

Then $\sum_{i=1}^{k_n} Z_i + \sum_{i=1}^{m_n} W_i = (\lambda_n c_n)^{-1/2} \sum_{j=1}^{n} \psi(Y_j)$, so that (2.2) may be rewritten as

(2.4)  $\sum_{i=1}^{k_n} (Z_i - EZ_i) + \sum_{i=1}^{m_n} (W_i - EW_i) \xrightarrow{d} N(0,1)$.
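The blocking scheme can be checked mechanically. The sketch below, in Python, builds the intervals $J_i$, their leading parts $J'_i$, the trailing parts that define the $V_i$, and the single separating integers $i(r_n+1)$ that carry the $W_i$. The function name `block_intervals` and the example values of $n$, $k_n$, $\ell_n$ are hypothetical and chosen only for illustration.

```python
def block_intervals(n, k, ell):
    """Split 1..n into k blocks of length r = n // k plus m = n - k*r separators.

    Returns (blocks, big, small, separators) where
      blocks[i-1]      = J_i,
      big[i-1]         = J'_i, the first r - ell integers of J_i,
      small[i-1]       = the last ell integers of J_i (these define V_i),
      separators[i-1]  = i * (r + 1), the integer carrying W_i, 1 <= i <= m.
    """
    r = n // k
    m = n - k * r
    blocks, separators = [], []
    for i in range(1, k + 1):
        if i <= m:
            start = (i - 1) * r + i          # blocks separated by one integer
            separators.append(i * (r + 1))
        else:
            start = (i - 1) * r + m + 1      # remaining blocks abut
        blocks.append(list(range(start, start + r)))
    big = [b[: r - ell] for b in blocks]
    small = [b[r - ell:] for b in blocks]
    return blocks, big, small, separators


# Example: n = 23, k = 4 gives r = 5 and m = 3 separating integers.
blocks, big, small, separators = block_intervals(23, 4, 1)
```

Every integer from 1 to $n$ falls in exactly one block or is one of the $m_n$ separators, which is the identity behind rewriting (2.2) as (2.4).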
In the following and throughout, undesignated ranges for sums and products are to be taken from 1 to $k_n$.

Lemma 2.1. Assume that the basic assumptions hold and also

(2.5)  $k_n(\operatorname{var} V_{n,1} + \operatorname{var} W_{n,1}) \to 0$.

Then

(2.6)  (i) $\sum (V_i - EV_i) \xrightarrow{P} 0$,  (ii) $\sum_{i=1}^{m_n} (W_i - EW_i) \xrightarrow{P} 0$.

Further it then follows that in proving (2.4) the second sum may be omitted and the r.v.'s in the first sum assumed independent. More specifically, under the above conditions, (2.4) (and hence (2.2)) hold if and only if

(2.7)  $\sum (\hat Z_i - E\hat Z_i) \xrightarrow{d} N(0,1)$

with $\hat Z_i$ assumed i.i.d., $\hat Z_i \stackrel{d}{=} Z_1$ $\bigl(= (\lambda_n c_n)^{-1/2} \sum_{j=1}^{r_n} \psi(Y_j)\bigr)$.

Proof: Since the $V_i$ are defined by groups of $X_j$ which are (for large $n$) separated by at least $\ell_n$ ($r_n/\ell_n \to \infty$ by (ii) of the basic assumptions), it follows by a standard induction on the mixing condition (cf. [12]) that

$\bigl| E\{\exp(it \sum (V_j - EV_j))\} - \prod E\{\exp(it(V_j - EV_j))\} \bigr| \le 16\, k_n \alpha_{n,\ell_n}$

which tends to zero as $n \to \infty$. Hence in showing (2.6)(i) it may be assumed that the terms are independent. But with this assumption the variance of the sum is $k_n \operatorname{var} V_1 \to 0$ by (2.5), so that (2.6)(i) holds. The proof of (2.6)(ii) is entirely similar.

It follows at once that (2.4) holds if and only if

(2.8)  $\sum (Z_i - EZ_i) \xrightarrow{d} N(0,1)$.

Now, to prove that (2.7) implies (2.8), let $(\hat U_i, \hat V_i)$ be pairs having the same distribution as $(U_i, V_i)$ but being independent for $1 \le i \le k_n$. If (2.7) holds, it holds with the specific choice $\hat Z_i = \hat U_i + \hat V_i$, and since clearly (2.6)(i) holds with $V_i$ replaced by $\hat V_i$, it follows that $\sum (\hat U_i - E\hat U_i) \xrightarrow{d} N(0,1)$. But again

$\bigl| E\{\exp(it \sum (U_j - EU_j))\} - \prod E\{\exp(it(U_j - EU_j))\} \bigr| \le 16\, k_n \alpha_{n,\ell_n} \to 0$

and in the second (product) term $U_j$ may be replaced by $\hat U_j$ ($\stackrel{d}{=} U_j$), so that $\sum (U_i - EU_i) \xrightarrow{d} N(0,1)$. Finally, since $\sum Z_i = \sum U_i + \sum V_i$, it follows from (2.6) that (2.8) and thus (2.4) and finally (2.2) hold. Thus (2.7) implies (2.2). The converse is similarly shown by simply reversing the chain of arguments. ■


This  lemma  leads  at  once  to  a  preliminary  but  useful  form  of  the  main 
result. 

Theorem 2.2. Suppose that (2.5) holds, in addition to the basic conditions. Then (2.2) holds if and only if the Lindeberg condition

(2.9)  $k_n E\{(Z_{n,1} - EZ_{n,1})^2\, 1(|Z_{n,1} - EZ_{n,1}| > \epsilon)\} \to 0$ as $n \to \infty$, each $\epsilon > 0$,

is satisfied.

Proof: This is immediate from Lemma 2.1 and the Lindeberg Central Limit Theorem since $k_n \operatorname{var} Z_{n,1} = 1$. ■


Finally  in  this  section  we  show  that  (2.1)  provides  a  simple  sufficient 
condition  for  (2.5).  Less  restrictive  sufficient  conditions  will  be  given 
later  when  exponential  decay  is  assumed. 
Lemma 2.3. If $\psi(x) \ge 0$ all $x$ and (2.1) holds (i.e. $c_n = o(k_n)$) in addition to the basic conditions, then

(2.10)  $\operatorname{var} Z_{n,1} \sim EZ_{n,1}^2$, so that $\lambda_n \sim \tilde\lambda_n = \dfrac{k_n}{c_n}\, E\Bigl\{ \sum_{j=1}^{r_n} \psi(Y_j) \Bigr\}^2$,

and (2.5) holds.

Proof: We have

$\Bigl[ E\Bigl\{ \sum_{j=1}^{r_n} \psi(Y_j) \Bigr\} \Bigr]^2 = (r_n E\psi(Y_1))^2 = r_n^2 \{ E\psi(Y_1) 1(Y_1 > 0) \}^2$
$\le r_n^2\, P\{Y_1 > 0\}\, E\psi^2(Y_1) = r_n^2 (1-F(u_n))\, E\psi^2(Y_1)$
$\le K\, \dfrac{c_n}{k_n}\, r_n E\psi^2(Y_1)$,

since $1-F(u_n) \sim c_n/n$ and $\psi(Y_i) \ge 0$, each $i$. Since $k_n r_n \sim n$, it thus follows that

$\bigl( E \sum_{1}^{r_n} \psi(Y_j) \bigr)^2 = o\bigl\{ E\bigl( \sum_{1}^{r_n} \psi(Y_j) \bigr)^2 \bigr\}$

by (2.1), which yields (2.10).

Since clearly $Z_{n,1} \ge \sum_{i=1}^{[r_n/\ell_n]} V'_{n,i}$ where $V'_{n,i} \stackrel{d}{=} V_{n,1}$, it follows that $EZ_{n,1}^2 \ge [r_n/\ell_n]\, EV_{n,1}^2$ and hence

(2.11)  $k_n \operatorname{var} V_{n,1} \le k_n EV_{n,1}^2 \le K\, \dfrac{k_n \ell_n}{r_n}\, EZ_{n,1}^2 \sim K\, \dfrac{\ell_n}{r_n}$

by (2.10), since $\operatorname{var} Z_{n,1} = 1/k_n$. But $\ell_n/r_n \sim k_n \ell_n/n \to 0$, so that $k_n \operatorname{var} V_{n,1} \to 0$. Similarly $k_n \operatorname{var} W_{n,1} \to 0$, showing (2.5). ■


3.  Exponentially  decreasing  tails. 

To obtain more detailed results we assume the following exponential-like rate of decay for the tail $1-F(x)$ of $F$:

(3.1)  $(1-F(t+x))/(1-F(t)) \to e^{-x/\beta}$ as $t \to \infty$, all $x \ge 0$, some $\beta > 0$.

Except for the final theorem of this section it will be assumed that the function $\psi(x)$ is non-negative and nondecreasing. We shall refer to the Augmented Basic Assumptions to indicate the addition of these conditions.

Lemma 3.1. If the Augmented Basic Assumptions and (3.1) hold, and if

(3.2)  $\int_0^\infty e^{(\epsilon - \beta^{-1})x}\, d\psi(x) < \infty$, some $\epsilon > 0$,

then

(3.3)  $E\psi(Y_{n,1}) \sim \dfrac{c_n}{n} \int_0^\infty e^{-x/\beta}\, d\psi(x)$.

Proof:

$E\psi(Y_1) = \int_{u_n}^\infty \psi(x - u_n)\, dF(x)$
$= \int_0^\infty \psi(x)\, dF(u_n + x)$
$= \int_0^\infty (1 - F(u_n + x))\, d\psi(x)$
$\sim (1 - F(u_n)) \int_0^\infty e^{-x/\beta}\, d\psi(x)$

by Theorem 1.8 (ii) of [5]. The result then follows from (iii) of the Basic Assumptions. ■
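To see what the constant $\int_0^\infty e^{-x/\beta}\,d\psi(x)$ in Lemma 3.1 looks like, it may help to evaluate it for three simple choices of $\psi$. These particular choices are illustrative; the first two are the ones that combine to give the Hill estimator in Section 4.

```latex
% Worked instances of A(\psi) = \int_0^\infty e^{-x/\beta}\, d\psi(x).
\begin{align*}
  \psi(x) &= x:
    & A &= \int_0^\infty e^{-x/\beta}\,dx = \beta, \\
  \psi(x) &= 1(x>0):
    & A &= e^{-0/\beta}\cdot 1 = 1
      \quad\text{(unit jump of $\psi$ at the origin)}, \\
  \psi(x) &= x^2:
    & A &= \int_0^\infty 2x\,e^{-x/\beta}\,dx = 2\beta^2 .
\end{align*}
% So E(X_1-u_n)_+ \sim \beta c_n/n,  P(X_1>u_n) \sim c_n/n,
% and E(X_1-u_n)_+^2 \sim 2\beta^2 c_n/n.
```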

Note that this result of course holds if it is assumed just that $E|\psi(Y_1)| < \infty$ rather than $E\psi^2(Y_1) < \infty$ in the Basic Assumptions. Using this, the lemma yields the following simple but useful facts.

Lemma 3.2. Under the assumptions of Lemma 3.1, with $\tilde\lambda_n$ as in (2.10),

(3.4)  $k_n EZ_{n,1} \sim A\,(c_n/\lambda_n)^{1/2}$,  $A = \int_0^\infty e^{-x/\beta}\, d\psi(x)$,

(3.5)  $k_n EZ_{n,1}^2 = 1 + A^2\, \dfrac{c_n}{k_n \lambda_n}\,(1 + o(1)) \le K\Bigl(1 + \dfrac{c_n}{k_n \lambda_n}\Bigr)$.

If also (3.2) holds with $\psi^2$ replacing $\psi$, then

(3.6)  $\liminf \tilde\lambda_n = \liminf \dfrac{k_n}{c_n}\, E\Bigl( \sum_{1}^{r_n} \psi(Y_j) \Bigr)^2 \ge \int_0^\infty e^{-x/\beta}\, d\psi^2(x)$

and

(3.7)  $k_n EW_{n,1}^2 \le k_n EV_{n,1}^2 \le K \min\Bigl( \dfrac{c_n \ell_n}{n \lambda_n},\ \dfrac{\ell_n^2}{r_n \lambda_n} \Bigr) + o(1)$.

Proof: (3.4) follows at once from (3.3) since $r_n \sim n/k_n$. (3.5) is obtained from (3.4) by noting that $\operatorname{var} Z_{n,1} = 1/k_n$.

If $\psi(x) \ge 0$ then $E\bigl( \sum_{1}^{r_n} \psi(Y_j) \bigr)^2 \ge r_n E\psi^2(Y_1)$, so that

$\liminf \tilde\lambda_n \ge \liminf k_n c_n^{-1} r_n E\psi^2(Y_1)$,

giving (3.6) by (3.3) with $\psi^2$ for $\psi$.

Finally, as in the proof of Lemma 2.3, by (2.11), and using (3.5),

$k_n EW_{n,1}^2 \le k_n EV_{n,1}^2 \le K\, \dfrac{k_n \ell_n}{r_n}\, EZ_{n,1}^2 \le K\, \dfrac{\ell_n}{r_n} \Bigl( 1 + \dfrac{c_n}{k_n \lambda_n} \Bigr) \le K\, \dfrac{c_n \ell_n}{n \lambda_n} + o(1)$,

since $\ell_n/r_n \sim k_n \ell_n/n \to 0$. The second bound $k_n EV_{n,1}^2 \le K \ell_n^2/(r_n \lambda_n)$ follows at once from the obvious (Minkowski) inequality $EV_{n,1}^2 \le \ell_n^2 E\psi^2(Y_{n,1})/(c_n \lambda_n)$ and (3.3) with $\psi^2$ in place of $\psi$. ■


The conditions (2.5) used in Theorem 2.2 may be readily verified directly in particular cases. However, simple sufficient conditions are obtainable from (3.7), viz. either

(3.8)  $c_n \ell_n/(n \lambda_n) \to 0$

or

(3.9)  $\ell_n^2/(r_n \lambda_n) \to 0$.

The (more useful) condition (3.8) is implied in particular by (2.1), viz. $c_n = o(k_n)$, by Lemma 2.3, (3.6) and the basic assumption $k_n \ell_n/n \to 0$.

To give simple sufficient conditions for the Lindeberg criterion it is convenient to truncate $\psi(Y_j)$ as follows. For constants $w_n$ to be specified, define

$Y'_j = Y_j\, 1(Y_j \le w_n) + w_n\, 1(Y_j > w_n)$,  $1 \le j \le n$,

$Z'_i = (\lambda_n c_n)^{-1/2} \sum_{j \in J_i} \psi(Y'_j)$.
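A small Python sketch of the truncation may help fix ideas. It assumes that the truncated excess is simply the excess capped at $w_n$, which is the form used in the proof of Lemma 3.3, where the error appears as $(\psi(Y_j)-\psi(w_n))\,1(Y_j > w_n)$. The function names are hypothetical.

```python
def truncate_excess(y, w):
    """Cap an excess y = (x - u)_+ at the truncation constant w."""
    return min(y, w)


def truncation_error(psi, y, w):
    """psi(Y) - psi(Y'), which is (psi(y) - psi(w)) when y > w and 0 otherwise."""
    return psi(y) - psi(truncate_excess(y, w))


# Example with the (assumed) nondecreasing choice psi(x) = x**2 on [0, inf).
psi = lambda x: x * x
err_large = truncation_error(psi, 5.0, 3.0)   # excess above the cap
err_small = truncation_error(psi, 2.0, 3.0)   # excess below the cap
```

Because $\psi$ is nondecreasing, the error is non-negative, and it vanishes whenever the excess does not exceed $w_n$.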

Lemma 3.3. Let the Augmented Basic Assumptions and (3.1) hold, and for some $0 < \epsilon < 1$, let $w_n$ satisfy

(3.10)  $\int_{w_n}^\infty \exp\{(\epsilon - \beta^{-1})x\}\, d\psi^2(x) \to 0$ as $n \to \infty$.

Then $k_n E(Z_1 - Z'_1)^2 \to 0$ as $n \to \infty$.

Proof: Note, using Minkowski's Inequality, that

$r_n^{-2} c_n \lambda_n E(Z_1 - Z'_1)^2 = r_n^{-2} E\Bigl\{ \sum_{j=1}^{r_n} (\psi(Y_j) - \psi(w_n))\, 1(Y_j > w_n) \Bigr\}^2$
$\le E\{(\psi(Y_1) - \psi(w_n))^2\, 1(Y_1 > w_n)\}$
$\le E\{(\psi(Y_1)^2 - \psi(w_n)^2)\, 1(Y_1 > w_n)\}$
$= \int_{u_n + w_n}^\infty (\psi(x - u_n)^2 - \psi(w_n)^2)\, dF(x)$
$= \int_{w_n}^\infty (1 - F(y + u_n))\, d\psi^2(y)$
$\le (1+\epsilon)(1 - F(u_n)) \int_{w_n}^\infty \exp[(\epsilon - \beta^{-1})x]\, d\psi^2(x)$

by Proposition 1.7 of [5], from which the desired conclusion follows by (3.10) since $1 - F(u_n) \sim c_n/n$ and $r_n k_n \sim n$. ■


The  following  theorems  are  now  simply  obtained. 

Theorem 3.4. Let the Augmented Basic Assumptions, (2.5), (3.1), (3.2) all hold, and let $w_n$ satisfy (3.10) and

(3.11)  $(c_n \lambda_n)^{-1/2}\, r_n \psi(w_n) \to 0$.

Then (2.2) holds, i.e. $(c_n/\lambda_n)^{1/2} (\beta_n^* - \beta_n) \xrightarrow{d} N(0,1)$.

Proof: By Theorem 2.2 it is sufficient to show that the Lindeberg Condition (2.9) holds. Now if $X$, $Y$ are any two random variables, it is readily checked that

(3.12)  $(X+Y)^2\, 1(|X+Y| > \epsilon) \le 4\bigl( X^2\, 1(|X| > \epsilon/2) + Y^2\, 1(|Y| > \epsilon/2) \bigr)$,

from which it follows (with $Z'_1 - EZ'_1$ for $X$ and $(Z_1 - EZ_1) - (Z'_1 - EZ'_1)$ for $Y$) that

$k_n E\{(Z_1 - EZ_1)^2\, 1(|Z_1 - EZ_1| > \epsilon)\} \le 4k_n E\{(Z'_1 - EZ'_1)^2\, 1(|Z'_1 - EZ'_1| > \epsilon/2)\} + 4k_n E(Z_1 - Z'_1)^2$.

The first term on the right tends to zero trivially since $0 \le Z'_1 \le (c_n \lambda_n)^{-1/2}\, r_n \psi(w_n) \to 0$ by (3.11), and the last term tends to zero by Lemma 3.3, so that (2.9) holds, as desired. ■


In the final result of this section we generalize Theorem 3.4 to include functions $\psi(x)$ which can be negative and not necessarily monotone.

Theorem 3.5. Suppose the assumptions of Theorem 3.4 are satisfied for each of the functions $\psi_1(x)$, $\psi_2(x)$, and write $\psi(x) = \alpha_1 \psi_1(x) + \alpha_2 \psi_2(x)$, with $\alpha_1$, $\alpha_2$ (positive or negative) constants. Let $\lambda_n^{(1)}$, $\lambda_n^{(2)}$, $\lambda_n$ be defined as in (2.3) relative to $\psi_1$, $\psi_2$, $\psi$ respectively, and suppose that $\lambda_n^{(k)} \le K\lambda_n$, $k = 1, 2$, $n = 1, 2, 3, \ldots$. Then (2.2) holds, i.e. $\beta_n^* = c_n^{-1} \sum_{i=1}^{n} \psi((X_i - u_n)_+)$, $\beta_n = n c_n^{-1} E\psi((X_1 - u_n)_+)$ satisfy

$(c_n/\lambda_n)^{1/2} (\beta_n^* - \beta_n) \xrightarrow{d} N(0,1)$.

Proof: If $\beta_n^{*(k)} = c_n^{-1} \sum_{i=1}^{n} \psi_k((X_i - u_n)_+)$, $\beta_n^{(k)} = E\beta_n^{*(k)}$, $k = 1, 2$, then Theorem 3.4 shows that $(c_n/\lambda_n^{(k)})^{1/2} (\beta_n^{*(k)} - \beta_n^{(k)}) \xrightarrow{d} N(0,1)$ and hence by Theorem 2.2 the Lindeberg conditions

(3.14)  $k_n E\{(Z_{n,1}^{(k)} - EZ_{n,1}^{(k)})^2\, 1(|Z_{n,1}^{(k)} - EZ_{n,1}^{(k)}| > \epsilon)\} \to 0$ as $n \to \infty$, each $\epsilon > 0$,

hold, where $Z_{n,i}^{(k)} = (\lambda_n^{(k)} c_n)^{-1/2} \sum_{j \in J_i} \psi_k(Y_j)$, $k = 1, 2$.

Since $\lambda_n^{(k)} \le K\lambda_n$, this Lindeberg Condition continues to hold for each $k = 1, 2$ if $\lambda_n^{(k)}$ is replaced by $\lambda_n$ in the definition of $Z_{n,1}^{(k)}$, and hence it holds for $\alpha_1 Z_{n,1}^{(1)} + \alpha_2 Z_{n,1}^{(2)}$ by the inequality (3.12). The remaining conditions of Theorem 2.2 regarding $\psi$ are readily checked, giving the stated result. ■

4.  The  Hill  Estimator 
Let

$N_n(x) = \sum_{i=1}^{n} 1_{\{X_i > x\}}$

be the number of exceedances of $x$ by $X_1, \ldots, X_n$, and let $\{z_n\}$ be a sequence of "levels", non-random or random. The Hill estimator $\hat\beta_n$ is then defined by

$\hat\beta_n = \hat\beta_n(z_n) = \dfrac{1}{N_n(z_n)} \sum_{i=1}^{n} (X_i - z_n)_+$.

The two cases which have mainly been considered are $z_n = u_n$, with $\{u_n\}$ a given non-random sequence, and $z_n = X_{c_n}^{(n)}$, the $c_n$-th largest of $X_1, \ldots, X_n$, with $\{c_n\}$ a given non-random sequence of integers. This leads to the two estimators

(4.1)  $\hat\beta_n(u_n) = \dfrac{1}{N_n(u_n)} \sum_{i=1}^{n} (X_i - u_n)_+$

and

(4.2)  $\hat\beta_n(X_{c_n}^{(n)}) = \dfrac{1}{c_n} \sum_{i=1}^{n} (X_i - X_{c_n}^{(n)})_+ = \dfrac{1}{c_n} \sum_{i=1}^{c_n} (X_i^{(n)} - X_{c_n}^{(n)})$.
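The two forms of the Hill estimator, with a fixed level and with a random level given by an upper order statistic, translate directly into code. The following Python sketch is illustrative only: the function names are hypothetical, ties among the largest observations are ignored, and the third function assumes positive data so that the log transformation mentioned in the Introduction applies.

```python
import math


def hill_fixed_level(xs, u):
    """Mean excess over a fixed level u: sum (x - u)_+ / #{x > u}."""
    excesses = [x - u for x in xs if x > u]
    if not excesses:
        raise ValueError("no observations exceed the level u")
    return sum(excesses) / len(excesses)


def hill_order_stat(xs, c):
    """Mean excess of the c largest values over the c-th largest value."""
    if not 1 <= c <= len(xs):
        raise ValueError("c must be between 1 and the sample size")
    top = sorted(xs, reverse=True)[:c]
    threshold = top[-1]                    # the c-th largest observation
    return sum(x - threshold for x in top) / c


def hill_regularly_varying(xs, c):
    """Same estimator applied to log x; estimates 1/alpha for power-law tails."""
    return hill_order_stat([math.log(x) for x in xs], c)


sample = [1.0, 2.0, 3.0, 4.0, 10.0]
fixed = hill_fixed_level(sample, 2.5)      # excesses 0.5, 1.5, 7.5
random_level = hill_order_stat(sample, 3)  # top three: 10, 4, 3
```

Note that the order-statistic version divides by $c_n$ even though the $c_n$-th largest value contributes a zero excess, in agreement with the second expression for (4.2).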
For the present purposes a somewhat stronger tail condition than before is needed. We suppose $F$ has one derivative $F'$ which satisfies

(4.3)  $\lim_{t \to \infty} \dfrac{F'(t)}{1 - F(t)} = \dfrac{1}{\beta}$.

Further, for $\{u_n\}$ (or $\{c_n\}$) given, define $\{c_n\}$ (or $\{u_n\}$) by

(4.4)  $1 - F(u_n) = \dfrac{c_n}{n}$

and assume throughout that $c_n \to \infty$, $c_n/n \to 0$. Here, if $u_n$ is given, the $c_n$ obtained from (4.4) may not be an integer. However, in that case we replace $c_n$ by its integer part. It is straightforward to check that this does not affect the proofs below.

We will prove that the estimators in (4.1) and (4.2) are asymptotically normal, with means

(4.5)  $\beta_n = \dfrac{n}{c_n}\, E(X_1 - u_n)_+ = E\{(X_1 - u_n) \mid X_1 > u_n\}$.

It follows from (3.1) (which in turn is implied by (4.3), cf. Lemma A2 of the appendix), as in Lemma 3.1, that

(4.6)  $\beta_n \to \beta < \infty$.

Write $\bar F(x) = 1 - F(x) = E\,1_{\{X_1 > x\}}$ and define, with notation similar to that in Section 2,

(4.7)  $\lambda_n = \dfrac{k_n}{c_n}\, \operatorname{Var}\Bigl\{ \sum_{j=1}^{r_n} \bigl( (X_j - u_n)_+ - \beta\, 1_{\{X_j > u_n\}} \bigr) \Bigr\}$.

Further, let

$S_n(x) = \dfrac{N_n(x) - n\bar F(x)}{\sqrt{\lambda_n c_n}}$,

$E_n(x) = \hat\beta_n(x) - \beta_n$.


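Lemma 4.1 (i) Suppose (4.3) holds, $c_n/\lambda_n \to \infty$,

(4.8)  $\sqrt{\dfrac{c_n}{\lambda_n}} \Bigl[ \dfrac{1}{c_n} \sum_{i=1}^{n} (X_i - u_n)_+ - \beta_n \Bigr] - \beta S_n(u_n) \xrightarrow{d} N(0,1)$

and either the first or the second term in (4.8) is tight (so that both are tight). Then

(4.9)  $\dfrac{N_n(u_n)}{\sqrt{\lambda_n c_n}}\, E_n(u_n) - \Bigl\{ \sqrt{\dfrac{c_n}{\lambda_n}} \Bigl[ \dfrac{1}{c_n} \sum_{i=1}^{n} (X_i - u_n)_+ - \beta_n \Bigr] - \beta S_n(u_n) \Bigr\} \xrightarrow{P} 0$

and

(4.10)  $\dfrac{N_n(u_n)}{\sqrt{\lambda_n c_n}}\, E_n(u_n) \xrightarrow{d} N(0,1)$.

(ii) If furthermore

(4.11)  $\sqrt{c_n/\lambda_n}\, (z_n - u_n)$ is tight

and

(4.12)  $S_n(z_n) - S_n(u_n) \xrightarrow{P} 0$,

then

$\dfrac{N_n(z_n)}{\sqrt{\lambda_n c_n}}\, E_n(z_n) - \dfrac{N_n(u_n)}{\sqrt{\lambda_n c_n}}\, E_n(u_n) \xrightarrow{P} 0$,

so that also

$\dfrac{N_n(z_n)}{\sqrt{\lambda_n c_n}}\, E_n(z_n) \xrightarrow{d} N(0,1)$.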

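Proof. Since $c_n = n\bar F(u_n)$, by (4.4), we have that

(4.13)  $\sqrt{\dfrac{c_n}{\lambda_n}}\, E_n(u_n) = \sqrt{\dfrac{c_n}{\lambda_n}} \Bigl[ \dfrac{1}{c_n} \sum_{i=1}^{n} (X_i - u_n)_+ - \beta_n \Bigr] - S_n(u_n)\, \hat\beta_n(u_n)$.

Further, since $c_n/\lambda_n \to \infty$, $\beta_n \to \beta$, and the two terms in (4.8) are tight, it follows that

(4.14)  $\dfrac{1}{c_n} N_n(u_n) \xrightarrow{P} 1$,  $\dfrac{1}{c_n} \sum_{i=1}^{n} (X_i - u_n)_+ \xrightarrow{P} \beta$,

and hence also $\hat\beta_n(u_n) \xrightarrow{P} \beta$. Now (4.9) and hence (4.10) readily follow using (4.8), (4.14) and tightness of $S_n(u_n)$.

It is obvious that if (3.1) holds, then $\bar F(z_n)/\bar F(u_n) \xrightarrow{P} 1$ if $z_n - u_n \xrightarrow{P} 0$. However, since (4.3) implies (3.1) by Lemma A2 of the appendix, this follows from (4.3). Similar arguments to those above now show that if in addition (4.12) holds, then

(4.15)  $\dfrac{1}{c_n} N_n(z_n) \xrightarrow{P} 1$.

Thus, to establish the rest of the lemma, it is enough to show that

$\sqrt{\dfrac{c_n}{\lambda_n}}\, \bigl( \hat\beta_n(z_n) - \hat\beta_n(u_n) \bigr) \xrightarrow{P} 0$.

We will first bound this expression on the set $\{z_n > u_n\}$. Formally, this may be done by multiplying by $1_{\{z_n > u_n\}}$ throughout, but for simplicity of notation we will just assume $z_n > u_n$ in the computations below. Then,

$\sum_{i=1}^{n} (X_i - u_n)_+ = \sum_{i=1}^{n} (X_i - z_n)_+ + (z_n - u_n) N_n(z_n) + r_n$,

for $r_n = \sum_{i=1}^{n} (X_i - u_n)\, 1_{\{u_n < X_i \le z_n\}}$ (in this proof $r_n$ denotes this remainder term, not the block length of Section 2). By straightforward computations,

$\sqrt{\dfrac{c_n}{\lambda_n}}\, \bigl( \hat\beta_n(z_n) - \hat\beta_n(u_n) \bigr) = \dfrac{c_n}{N_n(z_n)} \bigl( S_n(u_n) - S_n(z_n) \bigr) \hat\beta_n(u_n)$
$\quad + \Bigl[ \dfrac{1}{\sqrt{\lambda_n c_n}}\, n\bigl( \bar F(u_n) - \bar F(z_n) \bigr)\, \hat\beta_n(u_n)\, \dfrac{c_n}{N_n(z_n)} - \sqrt{\dfrac{c_n}{\lambda_n}}\, (z_n - u_n) \Bigr]$
$\quad - \dfrac{c_n}{N_n(z_n)}\, \dfrac{r_n}{\sqrt{\lambda_n c_n}}$
$= A_n + B_n + C_n$, say.

It follows directly from (4.15), (4.12) and $\hat\beta_n(u_n) \xrightarrow{P} \beta$ that $A_n \xrightarrow{P} 0$ as $n \to \infty$. Further, since $\bar F(x) = 1 - F(x)$, it follows from (4.3) that

$\log \dfrac{\bar F(z_n)}{\bar F(u_n)} = -\int_{u_n}^{z_n} \dfrac{F'(s)}{\bar F(s)}\, ds = -(z_n - u_n)/\beta + o(z_n - u_n)$.

According to Taylor's formula, $1 - x = -\log x + o(\log x)$ as $\log x \to 0$, and hence, using (4.4) and $\bar F(z_n)/\bar F(u_n) \to 1$,

(4.16)  $n\bigl( \bar F(u_n) - \bar F(z_n) \bigr) = c_n \Bigl( 1 - \dfrac{\bar F(z_n)}{\bar F(u_n)} \Bigr) = c_n (z_n - u_n)/\beta + o(c_n (z_n - u_n))$.

Thus,

$B_n = (\lambda_n c_n)^{-1/2} \bigl[ c_n (z_n - u_n)/\beta + o(c_n (z_n - u_n)) \bigr]\, \hat\beta_n(u_n)\, \dfrac{c_n}{N_n(z_n)} - \sqrt{\dfrac{c_n}{\lambda_n}}\, (z_n - u_n)$
$= \sqrt{\dfrac{c_n}{\lambda_n}}\, (z_n - u_n) \Bigl\{ \dfrac{c_n}{N_n(z_n)}\, \dfrac{\hat\beta_n(u_n)}{\beta}\, (1 + o(1)) - 1 \Bigr\} \xrightarrow{P} 0$

by (4.11) and (4.15), since $\hat\beta_n(u_n) \xrightarrow{P} \beta$.

Finally, to show that $C_n \xrightarrow{P} 0$ it suffices by (4.15) to prove that

(4.17)  $\dfrac{r_n}{\sqrt{\lambda_n c_n}} \xrightarrow{P} 0$.

From (4.11), it follows that we may choose $z'_n$ non-random, with $P(z'_n > z_n) \to 1$ and $(c_n/\lambda_n)^{1/4} (z'_n - u_n) \to 0$. Since $r_n$ is increasing in $z_n$, it is hence enough to prove (4.17) with $z_n$ replaced by $z'_n$. It then follows from (4.16) that

$\dfrac{E r_n}{\sqrt{\lambda_n c_n}} \le \dfrac{1}{\sqrt{\lambda_n c_n}}\, (z'_n - u_n)\, n\bigl( \bar F(u_n) - \bar F(z'_n) \bigr) = O\Bigl[ \sqrt{\dfrac{c_n}{\lambda_n}}\, (z'_n - u_n)^2 \Bigr] \to 0$ as $n \to \infty$.

Hence also $C_n \xrightarrow{P} 0$, and the desired conclusion holds on the set $\{z_n > u_n\}$. Similar considerations on $\{z_n < u_n\}$ then conclude the proof. ■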


It seems likely that under conditions similar to those in Section 2, the sequence of processes $\{S_n(u_n + x);\ |x| \le 1\}$ is tight, and has a continuous limit in $D[-1,1]$. In that case (4.12) would hold for any sequence $\{z_n\}$ with $z_n - u_n \xrightarrow{P} 0$. However, here we will consider only $z_n = X_{c_n}^{(n)}$.

Lemma 4.2 Suppose $S_n(u_n) \xrightarrow{d} Z$, for some random variable $Z$, and that (4.12) holds for any non-random sequence $\{z_n\}$ with $\sqrt{c_n/\lambda_n}\, (z_n - u_n)$ bounded. Then (4.12) holds also for $z_n = X_{c_n}^{(n)}$, i.e.

(4.18)  $S_n(X_{c_n}^{(n)}) - S_n(u_n) \xrightarrow{P} 0$,

and

(4.19)  $\sqrt{\dfrac{c_n}{\lambda_n}}\, (X_{c_n}^{(n)} - u_n) - \beta S_n(u_n) \xrightarrow{P} 0$.

Proof. By definition $N_n(X_{c_n}^{(n)}) = c_n$ and hence

$S_n(X_{c_n}^{(n)}) = \dfrac{c_n - n\bar F(X_{c_n}^{(n)})}{\sqrt{c_n \lambda_n}}$.

Let $U = (1/(1-F))^{\leftarrow}$ be the right continuous inverse of $1/(1-F) = 1/\bar F$ and set, for $x \in \mathbb{R}$,

$z_n = U\Bigl( \dfrac{n}{c_n} \cdot \dfrac{1}{1 - x\sqrt{\lambda_n/c_n}} \Bigr)$

so that

(4.20)  $\{ S_n(X_{c_n}^{(n)}) \le x \} = \{ X_{c_n}^{(n)} \le z_n \}$.

Also $u_n = U(n/c_n)$ and hence

$\sqrt{\dfrac{c_n}{\lambda_n}}\, (z_n - u_n) = \sqrt{\dfrac{c_n}{\lambda_n}} \int_1^{1/(1 - x\sqrt{\lambda_n/c_n})} \dfrac{n}{c_n}\, s\, U'\Bigl( \dfrac{n}{c_n}\, s \Bigr) \dfrac{ds}{s}$.

By (i) of the remark after Lemma A2 of the appendix, (4.3) implies that

$\dfrac{n}{c_n}\, s\, U'\Bigl( \dfrac{n}{c_n}\, s \Bigr) \to \beta$,  $n \to \infty$,

uniformly for $s$ in the considered range, and thus

$\sqrt{\dfrac{c_n}{\lambda_n}}\, (z_n - u_n) \sim \sqrt{\dfrac{c_n}{\lambda_n}}\, \beta \Bigl( -\log\bigl( 1 - x\sqrt{\lambda_n/c_n} \bigr) \Bigr)$,

which clearly is bounded, so that (4.12) holds for this $z_n$. By (4.20) and the definition of $X_{c_n}^{(n)}$,

$\{ S_n(X_{c_n}^{(n)}) \le x \} = \{ X_{c_n}^{(n)} \le z_n \} = \{ N_n(z_n) < c_n \} = \Bigl\{ S_n(z_n) < \dfrac{c_n - n\bar F(z_n)}{\sqrt{c_n \lambda_n}} \Bigr\} = \{ S_n(z_n) < x \}$.

Since $S_n(u_n) \xrightarrow{d} Z$ it follows from (4.12) that

$(S_n(X_{c_n}^{(n)}),\ S_n(u_n)) \xrightarrow{d} (Z, Z)$,

which implies $S_n(X_{c_n}^{(n)}) - S_n(u_n) \xrightarrow{d} Z - Z = 0$, proving (4.18).

To prove (4.19), set

$z_n = u_n + x\sqrt{\lambda_n/c_n}$,

so that

$\Bigl\{ \sqrt{\dfrac{c_n}{\lambda_n}}\, (X_{c_n}^{(n)} - u_n) \le x \Bigr\} = \{ X_{c_n}^{(n)} \le z_n \} = \{ N_n(z_n) < c_n \} = \Bigl\{ \beta S_n(z_n) < \dfrac{\beta}{\sqrt{\lambda_n c_n}}\, (c_n - n\bar F(z_n)) \Bigr\}$.

Here $c_n = n\bar F(u_n)$, by (4.4), and it follows from (4.16) that

$\dfrac{\beta}{\sqrt{\lambda_n c_n}}\, (c_n - n\bar F(z_n)) \sim \dfrac{\beta}{\sqrt{\lambda_n c_n}}\, c_n (z_n - u_n)/\beta = x$.

Since $\beta S_n(z_n)$ converges in distribution, reasoning as in the first part of the proof shows that (4.19) holds. ■


Asymptotic  normality  of  the  Hill  estimators  (4.1)  and  (4.2)  now  follows 
from  the  results  of  Sections  2  and  3. 

Theorem 4.3 (i) Suppose (4.3) holds, $c_n/\lambda_n \to \infty$, and the conditions of Theorem 3.5 are satisfied for

$\psi_1(x) = x$,  $\psi_2(x) = 1_{\{x > 0\}}$.

Then

(4.21)  $\dfrac{N_n(u_n)}{\sqrt{\lambda_n c_n}}\, E_n(u_n) \xrightarrow{d} N(0,1)$.

(ii) If the assumptions of part (i) are satisfied and if, writing $I_n = [u_n, z_n)$ if $z_n > u_n$, and $[z_n, u_n)$ otherwise,

(4.22)  $\dfrac{k_n}{c_n \lambda_n}\, \operatorname{Var}\Bigl( \sum_{i=1}^{r_n} 1_{\{X_i \in I_n\}} \Bigr) \to 0$

for any non-random $\{z_n\}$ with $\sqrt{c_n/\lambda_n}\, (z_n - u_n)$ bounded, then

$\dfrac{N_n(X_{c_n}^{(n)})}{\sqrt{\lambda_n c_n}}\, E_n(X_{c_n}^{(n)}) - \dfrac{N_n(u_n)}{\sqrt{\lambda_n c_n}}\, E_n(u_n) \xrightarrow{P} 0$

and hence

(4.23)  $\dfrac{N_n(X_{c_n}^{(n)})}{\sqrt{\lambda_n c_n}}\, E_n(X_{c_n}^{(n)}) \xrightarrow{d} N(0,1)$.
Proof. (i) Setting $\alpha_1 = 1$, $\alpha_2 = -\beta$ in Theorem 3.5, it follows that (4.8) holds. Since also the other conditions of Lemma 4.1 (i) are satisfied, (4.21) follows.

(ii) Clearly

$|S_n(z_n) - S_n(u_n)| = \dfrac{1}{\sqrt{c_n \lambda_n}} \Bigl| \sum_{i=1}^{n} \bigl( 1_{\{X_i \in I_n\}} - E\,1_{\{X_i \in I_n\}} \bigr) \Bigr|$,

and hence, by Lemmas 4.2, 4.1 the result follows if we prove that the righthand side of the expression above tends to zero in probability.

Proceeding along similar, but somewhat cruder lines than in Lemma 2.1, split the integers between 1 and $n$ up into $[n/k_n]$ "intervals" of length $r_n$, with one shorter interval remaining. As in Lemma 2.1 the sums of the $1_{\{X_i \in I_n\}}$, for $i$ belonging to the first, third, ... interval (the "odd intervals") are asymptotically independent, and it hence follows from (4.22) that the sum over all $i$ belonging to the odd intervals tends to zero. Similarly, the sum over all $i$ belonging to "even intervals" tends to zero.

Finally,

$E\bigl| 1_{\{X_i \in I_n\}} - E\,1_{\{X_i \in I_n\}} \bigr| \le 2E\,1_{\{X_i \in I_n\}} = 2|\bar F(z_n) - \bar F(u_n)| \sim 2 c_n n^{-1} |z_n - u_n|/\beta$,

by (4.16). Thus the expectation of the sum over $i$ belonging to the "remaining short interval" is bounded by

$K\, r_n\, \dfrac{c_n}{n\sqrt{c_n \lambda_n}}\, |z_n - u_n| \to 0$,

since $r_n/n \to 0$ and $\sqrt{c_n/\lambda_n}\, (z_n - u_n)$ is bounded. This completes the proof of part (ii). ■

5.  Estimation  of  A 

n 

For  inference  purposes  it  is  of  course  desirable  to  estimate  the  basic 
unknown  variance  A  .  Natural  estimators  are  given  by 

(5.1)  $\hat\lambda_n = (N_n(z_n))^{-1} \sum_{i=1}^{k_n} \Big[ \sum_{j \in I_i} (X_j - z_n)_+ - \hat\beta_n N_n(I_i) \Big]^2$

where $I_i$ is the interval $((i-1)r_n+1, \ldots, ir_n)$ and $z_n$ is either the nonrandom level $u_n$ or the random level $X^{(n)}_{c_n}$. Here for simplicity we consider the former case and show that in Theorem 4.3 $\lambda_n$ may be replaced by the estimator $\hat\lambda_n$ under appropriate conditions. This will clearly be the case if $\hat\lambda_n/\lambda_n \xrightarrow{P} 1$, which will be shown to hold at least for sequences $\{c_n\}$ satisfying further conditions including a strengthening of (3.8), viz.

(5.2)  $c_n = o(k_n \lambda_n)$.
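As an illustration, the block estimator (5.1) at a fixed level can be sketched as follows. This is a minimal sketch, not the authors' code: it assumes that $\hat\beta_n$ in (4.1) is the mean excess over the level (that formula is not restated in this section), it ignores a trailing incomplete block, and the function names `beta_hat` and `lambda_hat` are hypothetical.

```python
from typing import Sequence


def beta_hat(x: Sequence[float], z: float) -> float:
    """Mean excess over level z: sum of (x_i - z)_+ divided by #{x_i > z}.

    Assumed form of the estimator in (4.1), which is not shown in this section.
    """
    excesses = [xi - z for xi in x if xi > z]
    if not excesses:
        raise ValueError("no observations exceed the level z")
    return sum(excesses) / len(excesses)


def lambda_hat(x: Sequence[float], z: float, r: int) -> float:
    """Block-based variance estimate in the spirit of (5.1).

    The series is cut into consecutive blocks of length r. For each block we
    take (sum of excesses over z) - beta_hat * (number of exceedances), square
    it, sum over blocks, and divide by the total number of exceedances.
    """
    b = beta_hat(x, z)
    n_exceed = sum(1 for xi in x if xi > z)
    k = len(x) // r  # complete blocks only (an assumption of this sketch)
    total = 0.0
    for i in range(k):
        block = x[i * r:(i + 1) * r]
        excess_sum = sum(xi - z for xi in block if xi > z)
        count = sum(1 for xi in block if xi > z)
        total += (excess_sum - b * count) ** 2
    return total / n_exceed
```

Because each block contributes a single squared term, clustering of exceedances within a block shows up directly in the size of the estimate.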

Note that since $N_n(u_n)/c_n \xrightarrow{P} 1$ the divisor $N_n(u_n)$ in (4.1) may be replaced by $c_n$, so that it is sufficient to show that

(5.3)  $(c_n\lambda_n)^{-1} \sum_i \Big\{ \sum_{j \in I_i} \big[(X_j - u_n)_+ - \beta 1_{(X_j > u_n)}\big] - (\hat\beta_n - \beta) N_n(I_i) \Big\}^2 \xrightarrow{P} 1$

where $N_n(I_i) = \sum_{j \in I_i} 1_{(X_j > u_n)}$. Using the notation of Section 2 with $\psi(x) = x_+ - \beta 1_{(x>0)}$, (5.3) becomes

(5.4)  $\sum_i \big[Z_i - (c_n\lambda_n)^{-1/2} (\hat\beta_n - \beta) N_n(I_i)\big]^2 \xrightarrow{P} 1.$

Now (5.4) will hold if both

(5.5)  $\sum_i Z_i^2 \xrightarrow{P} 1$

and

(5.6)  $(c_n\lambda_n)^{-1} (\hat\beta_n - \beta)^2 \sum_i (N_n(I_i))^2 \xrightarrow{P} 0.$

The following lemma shows that the $Z_i$ may be assumed independent in proving (5.5).


Lemma 5.1  Assume the conditions of Lemma 3.1 for $\psi_1(x) = x_+$ and $\psi_2(x) = 1_{(x>0)}$, and let $\psi(x) = \psi_1(x) - \beta\psi_2(x)$. Let $\lambda_n^{(1)}, \lambda_n^{(2)}, \lambda_n$ be as in (2.3) relative to $\psi_1, \psi_2, \psi$ respectively and let (5.2) hold. Then (with Section 2 notation) $\sum Z_i^2 - \sum U_i^2 \xrightarrow{P} 0$. It then follows that (5.5) holds if it holds with the $Z_i$ assumed independent.


Proof :  With  =  Z^  -  we  have 
(5.7)  2  Z2  -  2  =  22  W  Zl  +  2  V^. 




Defining  ,  V^2^  with  respect  to  ^ ^  as  ^ni  *s  defined  relative  to  'p,  we 
have 


V  =  (X(1)/X  )Vj)  -  P(X(2)/X  )V2) 

ni  v  n  n7  ni  v  n  n7  ni 


so  that 


22  V2.  =  k  2V2.  £  KX_"k  (X^W})2  +  X^2^V^2^2). 
ni  n  nl  n  nv  n  nl  n  nl  ' 


i  Kc  8  /(nX  )  +  o(l). 
n  n  v  n7  v  7 

by  (3.7).  applied  to  and  \p^ •  This  tends  to  zero  by  (5.2)  and  the  basic 

op  2 

assumption  k  €  /n  -*  0.  Hence  2  VT  -*  0.  Further  22  Z.  is  bounded,  by  a  similar 
^  n  n  i  i 

2 

argument,  using  (3.5)  and  (5.2)  so  that  2Z^  is  tight  and  hence 
|2  Vj  Zj  |  1  (2  V^)^(2  Z2)^  So.  The  first  statement  thus  follows  from  (5.7)  and 
the  second  by  the  argument  used  in  Lemma  2.1  since  e.g. 

|2exp(it2  U?)  -  IT2exp(itU?)|  i  16k  a  -  -»  0.  ■ 

J  J  n  n*  n 

The  main  result  of  this  section  now  follows  readily.  In  this  =  mn  ^ 
will  be  used  to  denote  the  kth  central  moment  2(Zj-2Zj)  of 

Z.  =  (c  X  2*V(X  -u  y  . 

1  v  n  n7  j_i  J  n  + 

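Theorem 5.2  Let $F$ satisfy (4.3). Let the basic assumptions and (2.5) hold for $\psi_1(x) = x_+$, $\psi_2(x) = 1_{(x>0)}$, and write $\psi(x) = \psi_1(x) - \beta\psi_2(x)$. Let $\lambda_n^{(1)}, \lambda_n^{(2)}, \lambda_n$ be defined as in (2.3) relative to $\psi_1, \psi_2, \psi$ respectively and suppose that $\lambda_n^{(k)} \le K\lambda_n$, $k = 1, 2$, $n = 1, 2, 3, \ldots$. Assume that $c_n\lambda_n \to \infty$, $k_n m_{n,4} \to 0$ and (5.2) holds both as stated and with $\lambda_n^{(2)}$ replacing $\lambda_n$. Then $\hat\lambda_n/\lambda_n \xrightarrow{P} 1$ and hence (4.23) holds with $\hat\lambda_n$ replacing $\lambda_n$.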

Proof  As noted above it is sufficient to show that (5.5) and (5.6) both hold. Write $EZ_{ni} = m$. By (3.4) applied to $\psi_1$ and $\psi_2$ it is readily seen that

(5.8)  $k_n |m| \le K (c_n/\lambda_n)^{1/2}$

so that $k_n m^2 \to 0$ by (5.2), and hence $E(\sum Z_i^2) = k_n(\mathrm{var}\, Z_{n1} + m^2) = 1 + o(1)$. Thus (5.5) clearly follows if it is shown that $\mathrm{var}(\sum Z_i^2) \to 0$. Now assuming independence of the $Z_i$ by Lemma 5.1, it is readily checked that

$\mathrm{var} \sum Z_i^2 = k_n \,\mathrm{var}\, Z_1^2 \le k_n (m_4 + 4 m_3 m + 4 m_2 m^2).$

The first term $k_n m_4$ tends to zero by assumption. The second is dominated by $4 k_n m_4^{3/4} m = o(1)\, k_n^{1/4} m$, which tends to zero by (5.8) and (5.2). Since $m_2 = 1/k_n$ the final term is $4m^2$, which also tends to zero by (5.8) and (5.2). Hence (5.5) follows.

Finally to show (5.6) note (defining $Z_i^{(2)}$ as $Z_i$ but with respect to $\psi_2$) that

$E\,(c_n\lambda_n)^{-1} \sum (N_n(I_i))^2 = k_n (\lambda_n^{(2)}/\lambda_n)\, E (Z_{n1}^{(2)})^2 \le K\big(1 + c_n/(\lambda_n^{(2)} k_n)\big)$

by (3.5) and the assumed boundedness of $\lambda_n^{(2)}/\lambda_n$. Hence it follows from (5.2) with $\lambda_n^{(2)}$ for $\lambda_n$ that the means of the random variables $(c_n\lambda_n)^{-1} \sum (N_n(I_i))^2$ are uniformly bounded, and hence these r.v.'s form a tight sequence. Since $\hat\beta_n \xrightarrow{P} \beta$ (cf. remark after (4.14)), (5.6) now follows.


6.  Tail  and  quantile  estimators 

An important reason for interest in the estimators from the previous section is estimation of small tail probabilities and large quantiles. For example, quantiles are important for the design of engineering structures, and tail probabilities give the reliability of existing structures.
Thus, the problem is to use observations $X_1, \ldots, X_n$ to estimate probabilities or quantiles which are well outside "the range of the sample", so that non-parametric methods do not apply. Our starting point will be the tail condition (3.1), viz.


(6.1)  $\frac{1 - F(x+t)}{1 - F(t)} \to e^{-x/\beta}, \quad t \to \infty, \ x \in \mathbb{R}.$


To obtain the estimators we will just assume equality in (6.1), and replace $\beta$ by $\hat\beta_n$ and $1-F(z)$ by $N_n(z)/n$, with $N_n(z)$ the number of exceedances of $z$, as before. In this section we will only consider the choice $z = X^{(n)}_{c_n}$, for sequences $\{c_n\}$ with $c_n \to \infty$, $c_n/n \to 0$, although the results from Section 4 indicate this is not crucial.

Sometimes interest is not in tails and quantiles of the observations themselves, but in the corresponding quantities for maxima over some period, of length $N$, say. For example, in the water level data studied in Section 8 below, measurements are taken twice daily, at high tide, but the code stipulates that the probability of flooding the dike during a year should not exceed 1/10,000. Thus, the quantity needed is the p-th quantile of the yearly maxima, for p = 1/10,000. To obtain estimators, we will assume an extremal index $\theta > 0$ exists, as discussed in the introduction, so that

(6.2)  $P(M_N > z + x) \approx 1 - F(z+x)^{N\theta} \approx 1 - \exp\{-N\theta(1-F(z+x))\} \approx 1 - \exp\{-N\theta(1-F(z))e^{-x/\beta}\},$

in the range considered. Estimators are again obtained by assuming equality and replacing $\beta$ by $\hat\beta_n$ and $\theta$ by $\hat\theta_n$, $1-F(z)$ by $N_n(z)/n$.
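The step from marginal tail probabilities to maxima in (6.2) can be made concrete with a small sketch. It simply treats the middle approximation in (6.2) as an equality and inverts it; the function names and the numerical values used in the usage note are hypothetical and are not taken from the paper.

```python
import math


def max_exceedance_prob(p_marginal: float, n_period: int, theta: float) -> float:
    """P(M_N > y) under equality in (6.2): 1 - exp(-N * theta * p),
    where p = 1 - F(y) is the marginal tail probability."""
    return 1.0 - math.exp(-n_period * theta * p_marginal)


def marginal_prob_for_max(q: float, n_period: int, theta: float) -> float:
    """Invert the relation above: the marginal tail probability p for which
    the maximum over N observations exceeds the level with probability q."""
    return -math.log1p(-q) / (n_period * theta)
```

For instance, with an illustrative period length of N = 300 observations and theta = 0.5, `marginal_prob_for_max(1e-4, 300, 0.5)` gives the marginal exceedance probability whose quantile corresponds to a 1/10,000 exceedance probability for the maximum; for small q this is close to q/(N*theta).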

Thus far, we have discussed four cases, i.e. estimation of tails and quantiles for individual random variables and for maxima. There is also a fifth case which we will comment on briefly, when $N$ is much larger than $n$, and one wants to estimate the distribution of $M_N$. We will treat the four cases separately, the ideas each time being the same but the details somewhat different.

The discussion will be in terms of exponential-type tails satisfying (6.1). However, an extension to regularly varying tails is only a matter of straightforward translation. This is briefly discussed in the next section.


1. Tail estimation  As outlined above, the tail probability

$p = p_n = 1 - F(y),$

for $y = y_n$ increasing with $n$, is estimated by

(6.3)  $\hat p_n = \frac{c_n}{n} \exp\{-(y - X^{(n)}_{c_n})/\hat\beta_n\} = \frac{c_n}{n} \exp\{-W_n/\hat\beta_n\}, \quad W_n = y - X^{(n)}_{c_n}.$

A simple "propagation of errors" calculation for $\mathrm{var}\{e^{-W_n/\hat\beta_n}\}$ in terms of the mean and variance of $\hat\beta_n$ suggests that the asymptotic variance of $\hat p_n$ can be estimated by

(6.4)  $\hat\lambda(\hat p_n) = n^{-2} W_n^2 c_n \hat\beta_n^{-4} \hat\lambda_n e^{-2W_n/\hat\beta_n} = \hat p_n^2 W_n^2 \hat\beta_n^{-4} \hat\lambda_n c_n^{-1}$

for $\hat\lambda_n$ given by (5.1). We will prove asymptotic normality when $n \to \infty$, $y = y_n \to \infty$, $np_n \to 0$. The last condition in particular means that $w_n \to \infty$, for $w_n = y_n - u_n$, where $u_n$ $(= U(n/c_n))$ satisfies (4.4), as before. Further, put

$g(t) = 1/\beta - F'(t)/(1-F(t)).$
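The estimators (6.3) and (6.4) can be sketched in a few lines. This is a minimal sketch under stated assumptions: $\hat\beta_n$ is taken to be the mean excess over the c-th largest observation (the assumed form of (4.1)), the variance estimate $\hat\lambda_n$ from (5.1) is supplied by the caller as `lam_hat`, and the function name `tail_estimate` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def tail_estimate(x: Sequence[float], c: int, y: float,
                  lam_hat: float) -> Tuple[float, float]:
    """Return (p_hat, variance estimate) for the tail probability P(X > y).

    p_hat follows (6.3): (c/n) * exp(-W / beta_hat) with W = y - X_(c),
    and the variance follows (6.4): p_hat^2 * W^2 * lam_hat / (beta_hat^4 * c).
    """
    if c < 2:
        raise ValueError("need c >= 2 so that some observations exceed X_(c)")
    xs = sorted(x, reverse=True)
    n = len(xs)
    z = xs[c - 1]  # c-th largest observation, used as the (random) level
    excesses = [xi - z for xi in xs if xi > z]
    beta = sum(excesses) / len(excesses)  # mean excess over z (assumed (4.1))
    w = y - z
    p_hat = (c / n) * math.exp(-w / beta)
    var_hat = p_hat ** 2 * w ** 2 * lam_hat / (beta ** 4 * c)
    return p_hat, var_hat
```

The relative standard error implied by this variance is $W_n \hat\beta_n^{-2} \sqrt{\hat\lambda_n/c_n}$, so the estimate degrades as the target level $y$ moves further beyond the threshold.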


Theorem  6 . 1  Suppose  the  conditions  of  Theorem  4.3  are  satisfied  and  that 

np  -*  0.  If  furthermore 
n 


(6.5) 


I  ~  P 

sc  /X  sup  g(t+u  /p)  -»  0,  n  -» 

n  n  t*0  n 


(6.6) 


/ c  So. 

n  n  n 


(6.7) 


A  A  -1  /ft  A  J 

X(pn)-1/25  (pn  -  pn)  ^N(O.l). 


Proof  It  follows  from  Vc  /X  -♦  ®.  (4.19)  and  Theorem  3.5  that 

n  n 

W  -  w  =  u  -  S  0,  and  since  np  -*  0  implies  w  -»  00 ,  also  W  /w  Si. 

nnnc  *n  Kn  nn 

n 

Further,  it  then  follows  from  Theorem  4.3  and  (6.6)  that 

Hence  it  is  sufficient  to  prove  asymptotic  normality  with  Wn  replaced  by  w^  and 

A* 

Pn  replaced  by  P  in  (6.4). 


Now,  write 


F  =  i/c  /X  E  (X(n))  =  7c  /X  (p  (X(n))  -  P  ). 
n  n  n  nv  c  7  nn  vrnv  c  7  n7 
n  n 


G  =  /c"TX~  (X^  -  u  ). 
n  n  n  v  c  n7 

n 


so  that  F  S  N(0.1)  and  G  is  tight,  by  Theorem  4.3,  and  Lemma  4.2  and  (4.19). 


Then, 
pn  =


c  w  /5T  G  f>T  F 

n  g  t  n  n  t  t  n  ^ 

—  exP  {-  (« - zz — )  /  O  +  — zz — )> 


n 


P  Jc 

n 


P  ST 
n  n 


c  -w  /p  /Tf»  STg 
~  n  n  .  .  nnn.  nn, 

~e  '^^F- + 


n 


n 


since  F  and  G  are  tight,  and  since 
n  n 


p  S  p,  —  0.  w2  —  -»  0, 

n  c  n  c 

n  n 


by  assumption,  where  ~  means  that  the  ratio  of  the  two  sides  tends  to  one  in 

probability.  Further,  since  c  /n  =  F(u  ),  p  =  F(u  +  w  ). 

n  v  n'  *n  v  n  nJ 


c  -w  /p  c  -w  /p  w  /p  F(u  +w  ) 

n  n  n  n  , ,  n  v  n  ny 

—  e  -p  =  —  e  (1-e  - . 

n  n  n  — 


F(u  ) 
v  n' 


Hence 


a  a  1  try  a 

(6.8)  ^(Pn)"1/2(Pn-Pn) 


X,  PG  p2/^ 

(F  +  -2-}  +  - -  < 

x  n  »  I— 


>1  X 


n 


w  Jx 
n  n 


u  /J3  _ 

e  n  F(u  +w  ) 
v  n  n' 


u  /p  _ 
e  n 


—  ~  p 

with  F  =  1  -  F,  as  before.  Since  G  is  tight,  w  -»  00 ,  and  X  /X  -»  1  by  Theorem 

n  n  n  n 

5.2,  it  follows  that  the  first  term  is  asymptotically  standard  normal. 

It  thus  only  remains  to  prove  that  the  last  term  tends  to  zero.  By 
Taylor ‘s  formula  (cf.  the  proof  of  — ►  0  in  Lemma  4.1),  this  term  is 

asymptotically  equivalent  (in  the  sense  that  the  ratio  of  the  two  expressions 
tends  to  one  in  probability)  to 


fa  i. 


log 


n 


w  /p 

e  n  F<y*n> 

F(u„) 


=  -  P 


f  S0  s(V”n)ds 


n 


0. 
by (6.5).

Remark 6.2  (i) Since $\hat\lambda_n/\lambda_n \xrightarrow{P} 1$ and $W_n/w_n \xrightarrow{P} 1$, (6.5) and (6.6) might as well have been stated with $\hat\lambda_n$, $W_n$ replaced by $\lambda_n$, $w_n$. However, (6.5), (6.6) have the advantage that they involve only observed quantities, except for the function $g$.

(ii) From the form of $\hat\lambda(\hat p_n)$ and (6.6) it follows that the relative error $\hat p_n/p_n - 1$ tends to zero.


2. Quantile estimation  Let $x_p = U(1/p)$ be the $(1-p)$-th quantile (assumed to be unique) of the marginal d.f. $F$ of the $X_i$'s, so that $F(x_p) = 1-p$. Reasoning as before, $x_p$ may be estimated by

(6.9)  $\hat x_p = \hat\beta_n \log\frac{c_n}{np} + X^{(n)}_{c_n},$

for $p = p_n \to 0$, with $np_n \to 0$. This time, the asymptotic variance is estimated by

(6.10)  $\hat\lambda(\hat x_p) = \Big(\log\frac{c_n}{np}\Big)^2 \hat\lambda_n / c_n.$
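A sketch of the quantile estimator (6.9) and its variance estimate (6.10) follows. As before, it assumes that $\hat\beta_n$ is the mean excess over the c-th largest observation, takes $\hat\lambda_n$ as an input `lam_hat`, and uses the hypothetical name `quantile_estimate`.

```python
import math
from typing import Sequence, Tuple


def quantile_estimate(x: Sequence[float], c: int, p: float,
                      lam_hat: float) -> Tuple[float, float]:
    """Return (x_hat_p, variance estimate) for the (1-p)-th quantile.

    x_hat_p follows (6.9): beta_hat * log(c / (n p)) + X_(c),
    and the variance follows (6.10): log(c / (n p))^2 * lam_hat / c.
    """
    if c < 2:
        raise ValueError("need c >= 2 so that some observations exceed X_(c)")
    xs = sorted(x, reverse=True)
    n = len(xs)
    z = xs[c - 1]  # c-th largest observation
    excesses = [xi - z for xi in xs if xi > z]
    beta = sum(excesses) / len(excesses)  # mean excess over z (assumed (4.1))
    log_ratio = math.log(c / (n * p))
    return beta * log_ratio + z, log_ratio ** 2 * lam_hat / c
```

The quantile estimate is the inverse of the tail estimate (6.3): plugging $\hat p_n$ computed at a level $y$ back in as $p$ returns $y$.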


Theorem 6.3  Suppose the conditions of Theorem 4.3 are satisfied, $np_n \to 0$, and that $F$ satisfies A4 of the appendix. If furthermore

(6.11)  $\frac{\sqrt{c_n/\hat\lambda_n}}{\log(c_n/np_n)}\, a(u_n) \xrightarrow{P} 0,$

with $a(u_n)$ as in A4, then

(6.12)  $\hat\lambda(\hat x_{p_n})^{-1/2} (\hat x_{p_n} - x_{p_n}) \xrightarrow{d} N(0,1).$


Proof  With  the  same  notation  as  in  the  proof  of  Theorem  6.1, 


(6-13^  log(cn/npn)  ^Xpn  ~  Xpn^  Fn  +  log(cn/npn)^ 


sc  /A  c 

_  — -  n  n  {x  -  u  -  p  log  — —  }  . 
log(cn/npn)  pn  n  Kn  npR 

Here the first term is asymptotically standard normal, since $F_n \xrightarrow{d} N(0,1)$, $G_n$ is tight, and $c_n/np_n \to \infty$.

To  prove  that  the  second  term  tends  to  zero,  first  note  that  -log(l-F(un) 
=  log(n/cn)  by  (4,4).  Let  V  be  the  right-continuous  inverse  of  -log(l-F(un)) 
and  a(t)  =  a(V(t)),  as  in  the  proof  of  Lemma  Al.  Then,  for  a(t)  =  a(log(t)), 


we  have  that 


a(§-)  =  a(V(log(^-))) 
n  n 


and  hence 


(6.14) 


>lc  /A 

■: - 7 - t - 7  a(n/c  )  — ►  0,  n  — *  m 

log(cn/npn)  v  n' 


Further,  x  =  U(l/p  ),  u  =  U(n/c  ),  so  that  the  second  term  in  (6.13)  may  be 
Pn  n  n  n 

written  as 


c  /np 
n  n 


(6.15) 


'np  c  /np 

S  n(s  2-  u-(s  2_)  -  P)  i  2d  Hf  n  a(sn/cn) 
Inn  1 


by  the  appendix  (Lemma  A 2)  and  the  following  remark.  Now  A9  which  follows  from 
A4  implies  that  for  sufficiently  large  n  and  e  >0,  with  p  as  in  A3 


a(sn/cn)  i  a(n/cn)(l+t)s 


-Pp+e 
uniformly for $s \ge 1$ ([5] Proposition 1.7.5). Hence (6.15) is bounded by

c  /np 

2da(— )  /  U(l+e)s~Pp+e  — . 

C  -  s 

n  1 

Together with (6.14) this shows that the second term in (6.13) tends to zero in probability if A3 and A4 hold. ■

Remark 6.4  (i) Note that for the exponential distribution A4 holds with $c = 0$, any $\rho > 0$, and any $a$ satisfying A8.

(ii) Similarly to Remark 6.2, $\hat\lambda_n$ may be replaced by $\lambda_n$ in (6.11).


3. Estimation of the tail of $M_N$  For this we need an estimator $\hat\theta_n$, say, for the extremal index $\theta > 0$ (assumed to exist). An example of such an estimator (studied in [8]) is

$\hat\theta_n = \hat\theta_n(z_n) = \frac{1}{N_n(z_n)} \sum_{i=1}^{k_n} T_i(z_n),$

where $T_i(z_n) = 1$ if there is at least one exceedance of $z_n$ by the $X_j$'s for $j$ in the $i$-th block, $J_i$, and zero otherwise. As before $z_n = X^{(n)}_{c_n}$ is the natural choice. Further, let


0  =  —  g-q.  (u  )  , 

n  c  ’lv  n' 


with $c_n$, $u_n$ as in (4.4), and define an auxiliary quantity $\tilde\theta_n = \tilde\theta_n(y, N)$ by assuming equality in (6.2), i.e. assume that

$P(M_N > y) = 1 - \exp\{-N\tilde\theta_n(1-F(y))\} = 1 - \exp\{-N\tilde\theta_n p_n\},$


so that $\tilde\theta_n \to \theta$. Here we will not confine ourselves to some special form of $\hat\theta_n$, $\theta_n$, but will only assume we have some estimators $\hat\theta_n$ and constants $\theta_n$ which satisfy

0  -»  0  >  0. 

n 


(6.16) 

and  (6.20)  below. 


Ic/X  ^  p 

<en-0n>  1  °* 
n 


The obvious estimate of $p_n(N) = P(M_N > y)$, for $y = y_n \to \infty$, is

(6.17)  $\hat p_n(N) = 1 - \exp(-N \hat\theta_n \hat p_n),$

for $\hat p_n$ given by (6.3). Its variance may be estimated by

(6.18)  $\hat\lambda(\hat p_n(N)) = N^2 \hat\theta_n^2 (1 - \hat p_n(N))^2 \hat\lambda(\hat p_n),$

with $\hat\lambda(\hat p_n)$ given by (6.4).

In the case when $p_n(N)$ is small one may, by Taylor's formula, use the alternative estimator

$\tilde p_n(N) = N \hat\theta_n \hat p_n,$

and estimate its variance by

$N^2 \hat\theta_n^2 \hat\lambda(\hat p_n).$
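The blocks estimator of the extremal index and the estimators (6.17) and (6.18) can be sketched together. This is a hedged illustration: the blocks are taken as consecutive stretches of length `r` (the block length and the handling of a trailing partial block are assumptions of this sketch), and the function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def extremal_index_blocks(x: Sequence[float], z: float, r: int) -> float:
    """Blocks estimator: (number of blocks with at least one exceedance of z)
    divided by (total number of exceedances of z)."""
    n_exceed = sum(1 for xi in x if xi > z)
    if n_exceed == 0:
        raise ValueError("no observations exceed the level z")
    k = len(x) // r  # complete blocks only (an assumption of this sketch)
    blocks_hit = sum(
        1 for i in range(k) if any(xi > z for xi in x[i * r:(i + 1) * r])
    )
    return blocks_hit / n_exceed


def max_tail_estimate(n_period: int, theta_hat: float, p_hat: float,
                      var_p_hat: float) -> Tuple[float, float]:
    """(6.17) and (6.18): estimate of P(M_N > y) and of its variance, given
    the marginal tail estimate p_hat from (6.3) and its variance from (6.4)."""
    p_max = 1.0 - math.exp(-n_period * theta_hat * p_hat)
    var_max = (n_period * theta_hat * (1.0 - p_max)) ** 2 * var_p_hat
    return p_max, var_max
```

When $N\hat\theta_n\hat p_n$ is small, `p_max` is close to the linearised estimator $N\hat\theta_n\hat p_n$ mentioned above, and the factor $(1-\hat p_n(N))^2$ in the variance is close to one.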

P 

Theorem  6.5  Suppose  Npn  ->0,  (6.16)  holds,  and  the  conditions  of  Theorem  6.3 
are  satisfied.  Then 


(6.19)  X(in(N))~*  (;n(W)  -  PtMj^  >  y))  =  Tn  +  kin)~*  ' 

where  Tr  ^  N(0,1).  If  in  addition 


j; 


***  p 

(6.20)  — ~  --  (0  -0  )  £  0. 

w  n  n 
n 


then 


MPn(W))"*  (pn(N)  -  PfM^y))  ^N(O.l) 
Remark 6.6  (i) To establish (6.20) is a separate problem in probability theory, and clearly depends on which particular process one is considering. However, from a practical point of view, (6.20) requires that $N$ is large compared to typical cluster sizes. Of course, (6.17) should only be used when this is the case.
(ii) The assumption $Np_n \to 0$ (or, equivalently, that $N\hat p_n \xrightarrow{P} 0$) is only used for (6.21). However, (6.21) obviously is also satisfied in more general circumstances, and $\hat p_n(N)$ seems useful also for $N$ such that $P(M_N > y_n)$ does not tend to zero. This corresponds to the fifth case mentioned in the beginning of this section. ■
(6.24)  X(xp(N))“*  (0n-en)  So.


then 


Mxp(N))“*  (xp(N)  -  xp(N) )  ^N(O.l) 


Proof :  Define  the  function 


r(9)  =  •  0  <  0  ^  1. 


Then 


V  N>  *  XT(8n)  • 

with  xp  the  (l-p)-th  quantile  of  F,  as  before.  Further,  comparing  with  (6.9) 


we  have  that 


Hence 


Xp^N)  “  X*(0n)  ‘ 


VN) "  VN)  =  (x*(0n) "  x*(0n)}  +  (x»(en) "  xx(en)) 


=  P  log  n  +  (*w(e  y  ~  xT(0n))- 
n 


~  A 


Now,  since  0/0  -*  1,  it  follows  from  Theorem  6.3,  by  straightforward 

n  n 


arguments,  that 

T  =  X(x  (N))  "  (x  ~  .  -  x  ~  >) 
n  v  Pv  ”  v  *(0J  v(9n)J 

^  N(0.1). 


n'  '  n' 

n  -*  «, 


which proves (6.23). Since $\hat\beta_n \xrightarrow{P} \beta$, (6.24) implies that the last term in (6.23) tends to zero in probability, which concludes the proof of the theorem. ■


7.  Regularly  varying  tails 


We now very briefly indicate how the results should be translated when the tail $1-F$ decreases in a "polynomial" (regularly varying), rather than exponential, manner. More specifically, condition (3.1) is replaced by

(7.1)  $\frac{1 - F(xt)}{1 - F(t)} \to x^{-1/\beta}, \quad t \to \infty, \ x > 0.$


Clearly, if a positive random variable $X$ has distribution function $F$ satisfying (7.1) then the distribution function of $\log X$ satisfies (6.1) with the same $\beta$. The following condition replaces (4.3) for the present case.

(7.2)  $\frac{t F'(t)}{1 - F(t)} \to \frac{1}{\beta}$ as $t \to \infty$.


For the convenience of the reader, we reformulate some of the results of Sections 4 and 6 for distribution functions satisfying (7.1) or (7.2). Define now (cf. (4.1))

(7.3)  $\hat\beta_n = \frac{1}{N_n(z_n)} \sum_{i=1}^{n} (\log X_i - z_n)_+,$

where $N_n$ in the present section has been redefined as

$N_n(x) = \sum_{i=1}^{n} 1_{(\log X_i > x)}.$

Let $\lambda_n$ be as in (2.3) and $\hat\lambda_n$ be as in (5.1), but with $X_i$ replaced by $\log X_i$. Theorem 4.3 may be immediately restated in the present context. For example, if the tail condition (7.2) holds and the other conditions of Theorem 4.3 are satisfied with $X_i$ replaced by $\log X_i$, then

N  (z J  .  . 

n 


(7.4) 
for $z_n = u_n$ and $z_n = \log X^{(n)}_{c_n}$, where $\beta_n = \frac{n}{c_n} E(\log X_1 - u_n)_+$. Similarly, Theorem 5.2 may be simply adapted to give a result under which (7.4) holds with $\hat\lambda_n$ replacing $\lambda_n$.
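The translation to regularly varying tails amounts to applying the same mean-excess computation to the logarithms of the data, as in (7.3). The sketch below uses the random level $z_n = \log X^{(n)}_{c_n}$; the function name is hypothetical, and the divisor is the number of strict exceedances of the level, which may differ by one from other conventions for Hill-type estimators.

```python
import math
from typing import Sequence


def beta_hat_log(x: Sequence[float], c: int) -> float:
    """Estimate beta for a regularly varying tail as in (7.3): the mean
    excess of log X_i over z = log of the c-th largest observation.
    Observations must be positive."""
    if c < 2:
        raise ValueError("need c >= 2 so that some observations exceed the level")
    logs = sorted((math.log(xi) for xi in x), reverse=True)
    z = logs[c - 1]
    excesses = [v - z for v in logs if v > z]
    return sum(excesses) / len(excesses)
```

Tail and quantile estimates for the original variable then follow by applying the estimators of Section 6 to $\log X$ and exponentiating the resulting quantile.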

Next  consider  the  analogue  of  Theorem  6.1.  In  addition  to  the 
specifications  above  let 


(7.5) 


P  l-F(eC) 
=  1-F(ey) 


and let $\hat p_n$ and $W_n$ be as in (6.3) but with $X$ replaced by $\log X$. Then the formulation of Theorem 6.1 goes through without further changes.

To  adapt  Theorem  6.3,  replace  A4  by 

tx  F '  ( tx)  1_ 

with  t  >  0  satisfying 

(7.7)  nn*  ifrr1  = x_p 

tH(n 

Let  Xp  =  log  U(l/p)  and  let  x^  be  as  in  (6.9)  but  with  X  replaced  by  log  X. 
Then  the  formulation  of  Theorem  6.3  goes  through,  with  a(t)  replaced  by  -ir(et) . 


8.  Application  to  water  level  data. 

Reliable high tide water levels are available from about 1885 onwards at five stations along the Dutch coast. We restrict ourselves to the station Hoek van Holland (part of the city of Rotterdam) and observe that high water levels are mainly due to wind storms. All data obtained outside winter periods, October 1 - March 15, are removed: significant wind storms mainly occur during the winter.

One is interested in the tail of the marginal distribution, in view of the design of sea dikes. Since high levels are mainly due to wind storms, there is short range dependence in the data - the influence of a severe wind storm typically lasts several days - but not much long range dependence. The theory of clustering of high values and extremal indices seems suitable for description of the available data. Also at first glance the exponential distribution gives a reasonable fit and the data seem stationary. Since extrapolation outside the range of observations is quite critical, it seems wise to consider a larger class of distributions than just the exponential one, so we adopt assumption A1. In order to single out the influence of wind storm activity we did not use the original observations but so-called set-up values, that is, the difference between the observed value and the value predicted on the basis of the movements of sun, moon and earth ("astronomical levels").

In this way a data set of size 17,544 covering the years 1887-1985 is obtained. The estimates $\hat\beta_n$ and $\hat\theta_n$ were calculated (cf. (4.1) and (6.2)) for various levels $u_n$ and the 95% two-sided confidence interval for $\beta$ obtained. Figure 8.1 shows $\hat\beta_n$ and its estimated confidence interval against the chosen levels $u_n$. The block size $r_n = 30$ has been used for the intervals (cf. formula (5.1)). However, the intervals were rather insensitive to changes in $r_n$. As expected the value of $\hat\beta_n$ fluctuates substantially when $u_n$ is high, since then few observations are used. From a theoretical point of view a bias could develop when $u_n$ is low (since then $\beta_n$ may differ significantly from $\beta$). This phenomenon does not seem to occur in the range considered here. Figure 8.2 shows $\hat\theta_n$ plotted against the chosen level $u_n$. The approximate monotonicity of this function points towards a serious bias of the estimation method for low levels. However, $\hat\theta_n$ is clearly less than one.
Fig. 8.1  Estimates $\hat\beta_n = \hat\beta_n(u)$ and approximate 95% confidence intervals for $\beta$, based on n=17,544 tide level measurements at Hoek van Holland.

Fig. 8.2  Estimates $\hat\theta_n = \hat\theta_n(u)$ of the extremal index $\theta$ for the same tide level measurements as in Fig. 8.1.


9.  Simulations 

To assess the behaviour of $\hat\beta_n$ for small samples the following processes were simulated.

EMA0:  $X_t = e_t$
EMA1:  $X_t = e_t + e_{t+1}$
       with $\{e_t\}$ i.i.d. exponential r.v.'s;

PMA1:  $X_t = e_t + e_{t+1}$
PAR1:  $X_t = (.958 X_{t-1} + e_t)/1.95$
       with $\{e_t\}$ i.i.d. Pareto r.v.'s with $P(e_t > x) = 1/x$ for $x > 1$.


The first two of these processes have asymptotically exponential tails, and the last two have asymptotically Pareto tails. All four have $\beta = 1$, and the $\theta$-values are 1, 1, 0.50, 0.51.
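The four processes can be generated with a short sketch like the one below. It is not the authors' simulation code. The exponential innovations are taken to have unit mean (an assumption consistent with $\beta = 1$), the AR coefficient and divisor default to the values as printed above (the scanned source is hard to read at that point, so they are exposed as parameters), and the PAR1 start at zero with a burn-in of 500 values follows the description in the text.

```python
import random
from typing import List


def simulate(process: str, n: int, seed: int = 0, ar_coef: float = 0.958,
             ar_div: float = 1.95, burn_in: int = 500) -> List[float]:
    """Simulate n values of EMA0, EMA1, PMA1 or PAR1."""
    rng = random.Random(seed)

    def exp_noise() -> float:
        return rng.expovariate(1.0)  # unit-mean exponential (assumed scale)

    def pareto_noise() -> float:
        # 1 - U lies in (0, 1], so P(1 / (1 - U) > x) = 1/x for x > 1
        return 1.0 / (1.0 - rng.random())

    if process == "EMA0":
        return [exp_noise() for _ in range(n)]
    if process in ("EMA1", "PMA1"):
        draw = exp_noise if process == "EMA1" else pareto_noise
        e = [draw() for _ in range(n + 1)]
        return [e[t] + e[t + 1] for t in range(n)]
    if process == "PAR1":
        x, out = 0.0, []
        for t in range(burn_in + n):
            x = (ar_coef * x + pareto_noise()) / ar_div
            if t >= burn_in:  # discard the initial transient
                out.append(x)
        return out
    raise ValueError("unknown process: " + process)
```

Fixing the seed makes a replication reproducible, which is convenient when the same series is to be analysed with several block sizes.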

The simulation for the PAR1-process was started with $X_0 = 0$, discarding the first 500 values. To give an extra check on the results, several of the simulations were independently programmed and run twice, using different standard random number generators. For each replication the quantity


V  = 


A 


was computed, for the first two processes from (4.1), (4.2) and (5.1), and for the PMA1 and PAR1-processes from the formulae in Subsection 6.5, using the logarithms of the observations. In the simulations (except for Fig. 9.1.b) the sample size was $n = 4000$ and $c_n$ was chosen as $np$ for $p = .1$ and $p = .05$, i.e. $c_n$ was 400 or 200. Here, we used the value 4000 to yield $c_n$'s for which the deviations from the normal limit still are quite clear.

All the simulations were performed both for "fixed $c_n$" and "fixed $u_n$". However, as was expected, the differences between the two cases were quite small, and hence only the "fixed $c_n$" results are exhibited below.

According to Theorems 5.2 and 4.3 (ii), $V$ should be approximately normally distributed. The results of the simulations are given in Tables 9.1, 9.2, and Figure 9.1 below.

Process   p ($c_n = np$)   $\nu(-1.64)$   $\bar\nu(1.64)$   $\nu(-1.96)$   $\bar\nu(1.96)$

EMA0      .1               .07            .04               .04            .02
          .05              .08            .03               .05            .01
EMA1      .1               .07            .04               .05            .02
          .05              .08            .04               .05            .01
PMA1      .1               .10            .02               .06            .01
          .05              .08            .03               .05            .02
PAR1      .1               .14            .04               .09            .01
          .05              .16            .02               .12            .01

Normal probability         .05            .05               .025           .025

Table 9.1.  Values of $\nu(x)$ = #{simulations with $V \le x$}/2000 and $\bar\nu(x) = 1 - \nu(x)$ based on 2000 simulations of each of the eight cases, for sample size n=4000, fixed $c_n$, and block size $r_n = 20$.
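The table entries are plain empirical frequencies of the standardised statistic $V$ over the replications. A minimal sketch of that tabulation, with hypothetical function names, is:

```python
from typing import Sequence


def nu(values: Sequence[float], x: float) -> float:
    """Fraction of replications with V <= x."""
    return sum(1 for v in values if v <= x) / len(values)


def nu_bar(values: Sequence[float], x: float) -> float:
    """Fraction of replications with V > x, i.e. 1 - nu(x)."""
    return 1.0 - nu(values, x)
```

Under the normal limit, `nu(v, -1.64)` and `nu_bar(v, 1.64)` should both be near .05, and the corresponding values at 1.96 near .025, which is the comparison made in the last row of the table.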


Block size $r_n$     $\nu(-1.64)$   $\bar\nu(1.64)$   $\nu(-1.96)$   $\bar\nu(1.96)$

2                    .13            .04               .08            .02
4                    .11            .03               .07            .01
10                   .10            .03               .06            .01
20                   .10            .02               .06            .01
40                   .10            .03               .06            .01

Normal probability   .05            .05               .025           .025

Table 9.2  Values of $\nu(x)$ = #{simulations with $V \le x$}/2000 and $\bar\nu(x) = 1 - \nu(x)$ based on 2000 simulations of each of the seven cases, for the PMA1-process, with sample size n=4000 and fixed $c_n = np$ for $p = .1$, i.e. $c_n = 400$.
Fig. 9.1  Histograms and normal probability plot of V-values from 2000 simulations of each of the sample sizes n = 4000 (Fig. 9.1.a) and n = 8000 (Fig. 9.1.b) for the PMA1-process. In the simulations $c_n$ was fixed, and equal to $np$ for $p = .1$, the block size was $r_n = 20$, and the smooth curve in the histograms is the standard normal density.

From the results above it can be seen that even for $c_n = 400$ and $n = 4000$ there are clear deviations from the limit. A main reason for this is variability, and to some extent bias, of the $\hat\lambda_n$-estimator. In addition, the stronger dependence in the PAR1-process also seems to slow down convergence. For $c_n = 800$, say, (and $n = 8000$) the normal fit is much better, as seen from Fig. 9.1.

It is clear that the block sizes $r_n = 2$ and $r_n = 4$ are too small for the PMA1-process (cf. Table 9.2). However, there does not seem to be much difference between $r_n$ = 10, 20 or 40, even if one also looks at the values of $\hat\lambda_n$. This is as expected, since $\theta$ = .51, and hence clusters on the average contain about two exceedances.
Remark  Note that A4 implies A3 if $c \neq 0$.
Lemma  A1  A4  implies 


1-Fft+x)  -x/P 

H.  ~  e - -  -c  e-^ 

t-t®  a(t) 


(x  C  IR) 


locally  uniformly. 

Proof  By  A1  (which  follows  from  A4)  with  H  :=  -log(l-F) 


ex/p  l-F(t+x)  -  j  H(t+x)  -  H(t)  -  x/p 


a(t) 

o(t) 

%[ 

F* ( t+s)  1 
l-F(t+s)  p  J 

ds 

a(t) 

-»  -c  e  ps  ds 

by  A4  and  [5.  Theorem  1.8  (ii)]  (or  [4],  for  c=0).  The  local  uniformity 
follows  from  the  local  uniformity  in  A4. 


Lemma  A^ 

1 .  c  =»  b  =>  a. 

2.  Let  V  be  the  (right  continuous)  inverse  function  of  -log(l-F).  Equivalent 
forms  of  a,b  and  c  are 

a ' .  Suppose 

A5  lim  V( t+x)  -  V(t)  =  p  x  (x  €  R) 

tH» 

for  some  positive  constant  p. 
b'  Suppose  V  has  a  derivative  V'  and 

A6  lim  V’(t)  =  P  . 

t-<» 


c'  Suppose  there  exists  a  positive  function  a  satisfying 
A7


11.  =  e~PpX  (x  €  R) 


for  some  positive  constant  p.  such  that 


A8 


lim 


V*  (t+x)  -  p 
a(t) 


de-ff(,x 


(x  €  IR) 


for  some  real  constant  d  = 


Proof  1. The implications are immediate since any function $a(t)$ satisfying A3 converges to zero ($t \to \infty$).

2. For the equivalence of a and a', see [[5] Proposition 1.7(9)]. The equivalence of b and b' is immediate.


c<=>c‘:  First  note  that  A3  and  A4  together  are  equivalent  to  A3  and  A4  with  x=0 
and  similarly  A7  and  A8  are  equivalent  to  A7  and  A8  with  x=0.  Let  a(x)=a(V(x}) 
and  let  V*~  be  the  right-continuous  inverse  of  V  (so  that  typically  V*"  = 
-log(l-F)).  Since 

_ I _ a 

V*(V(t))  1_F^ 

for  x=0  the  left  hand  side  of  A4  equals 


lim  - - - 

t-*»  PV*(V  (t)) 


UfitUzB  .  zl  lim  .  ZL  h. 

a(V*(t))  p2  a(V*’(t))  p2  iS 


a(s) 


Since  the  equivalence  of  A3  and  A7  is  immediate  from  a’,  it  now  follows  that  c 


holds  if  and  only  if  c'  holds. 


□ 


Remark,  (i)  Let  U  =  (1/(1-F))*’,  so  that  U(t)  =  V(log  t).  Then  A6  at  once 


translates  into  tU'(t)  -*  p,  and  A8  into 

*9  11m  W'W  =  dx^ 

»(log  t) 


(ii)  Let  Xj.  Xg. . • •  be  i.i.d.  with  distribution  function  F.  Condition  a 


is  equivalent  to 


P{max(X1> . . . ,Xn)  -  U(n)  $  x}  ■>  exp  {-e  x/^} 
(n  -»  00 )  for  all  x.  Condition  b  implies  that 


has  asymptotically  a  standard  normal  distribution,  where  -»  <»,  c^/n  -»  0  and 

{X^n^}"_i  are  the  descending  order  statistics  of  X^ . X^.  Condition  c  (cf. 

Lemma  A^)  is  sufficient  for  the  asymptotic  normality  of  Hill’s  estimator,  (cf. 
[3]  Theorem  3.1  and  Remark  4). 